39 research outputs found

    Exponential smoothing techniques on time series river water level data

    River water levels usually rise during the rainy season. Such rises can lead to devastating flash floods, which can damage property and possibly cause loss of human life. These events are known as extreme events because of the nature of the data they produce, which mostly follows a nonlinear pattern. The presence of nonlinear patterns and noise greatly affects the quality of prediction results. Three exponential smoothing techniques were investigated for their ability to handle extreme river water level time series data: the Single Exponential Smoothing Technique, the Double Exponential Smoothing Technique and Holt’s Method. The techniques were applied to river water level data from three rivers in Perlis, Malaysia. The experiments showed that all three techniques have their own limitations in handling extreme data, with the Double Exponential Smoothing Technique performing better than its counterparts.
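
As a rough illustration of two of the techniques named in this abstract, the sketch below implements single exponential smoothing and Holt's linear trend method in plain Python. The initialisation choices, the smoothing parameters and the sample water levels are assumptions made for the example and are not taken from the paper; Brown's double exponential smoothing is not shown.

```python
from typing import List


def single_exponential_smoothing(series: List[float], alpha: float) -> List[float]:
    """Return the smoothed series s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not series:
        return []
    smoothed = [series[0]]  # assumption: initialise with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def holt_linear(series: List[float], alpha: float, beta: float) -> List[float]:
    """Return one-step-ahead forecasts from Holt's linear trend method."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]  # assumption: initial trend = first difference
    forecasts = []
    for x in series[1:]:
        forecasts.append(level + trend)  # forecast for x, made before seeing x
        new_level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts


# Hypothetical water levels in metres, with a sudden rise in the middle.
levels = [2.0, 2.1, 2.3, 2.2, 3.9, 4.5, 3.0]
ses = single_exponential_smoothing(levels, alpha=0.5)
holt = holt_linear(levels, alpha=0.5, beta=0.5)
```

Both functions react to the jump at the fifth observation only after it has happened, which is one way to see why smoothing methods of this kind struggle with extreme values.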

    Fuzzy and smote resampling technique for imbalanced data sets

    Many factors can affect the performance of a classifier. One of these is an imbalanced dataset, which can reduce classification accuracy. In binary classification, a classifier often ignores instances in the minority class. Resampling techniques, specifically undersampling and oversampling, are commonly used to overcome the problems related to imbalanced datasets. In this study, an integration of undersampling and oversampling techniques is proposed to improve classification accuracy. The proposed technique integrates Fuzzy Distance-based Undersampling and SMOTE. The findings indicate that the proposed combined technique is able to produce more balanced datasets and thereby improve classification accuracy.
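
The oversampling half of such a combination can be sketched as follows. This is a minimal, standard-library version of the SMOTE interpolation step only; the fuzzy distance-based undersampling component is not reproduced here, and the function name, parameters and sample points are hypothetical.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def smote(minority: List[Point], n_new: int, k: int = 2, seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class points in two dimensions.
minority_class = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
new_points = smote(minority_class, n_new=4)
```

Each synthetic point lies on a line segment between two existing minority points, so the minority class grows without exact duplication of instances.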

    A conceptual model of enhanced undersampling technique

    Imbalanced datasets often degrade classifier performance. Undersampling is one of the approaches used when dealing with imbalanced datasets. This paper discusses the advantages and disadvantages of several undersampling techniques. An enhanced distance-based undersampling technique is proposed to balance the imbalanced data that will be used for classification. Fuzzy logic has been integrated into the distance-based undersampling technique to resolve the ambiguity and bias issues.

    Classification of machine learning engines using latent semantic indexing

    With the huge increase in software functionality, size and application domains, the difficult task of categorizing and classifying software for information retrieval and maintenance purposes is in demand. This work uses Latent Semantic Indexing (LSI) to classify neural network and k-nearest neighbor source code programs. Functional descriptors of each program are identified by extracting terms contained in the source code. In addition, information on where the terms are extracted from is also incorporated into the LSI. In the experiment undertaken, the LSI classifier achieved higher precision and recall than the C4.5 algorithm as provided in the Weka tool.

    Grid load balancing using enhance ant colony optimization

    This study presents a new algorithm based on ant colony optimization for load balancing management in grid computing. The focus is on improving the way ants search for the best resources, minimizing the processing time of each job while balancing the workload across available resources. An enhanced technique is proposed for the pheromone update activities. A single colony of ants is used to search for the best resources to process jobs. The proposed algorithm was tested against other load balancing algorithms, and the results showed that it was able to balance the load on the resources.

    Resource management in grid computing using ant colony optimization

    Managing resources in a grid computing system is complicated due to the distributed and heterogeneous nature of the resources. Stagnation in a grid computing system may occur when all jobs require or are assigned to the same resources, which leads to those resources having a high workload or to long job processing times. This research proposes an Enhanced Ant Colony Optimization (EACO) algorithm that caters for dynamic scheduling and load balancing in the grid computing system. The algorithm consists of three new mechanisms that organize the work of an ant colony, i.e. the initial pheromone value mechanism, the resource selection mechanism and the pheromone update mechanism. The resource allocation problem is modeled as a graph that can be used by the ant to deliver its pheromone. This graph consists of four types of vertices, namely job, requirement, resource and capacity, which are used in constructing the grid resource management element. The proposed EACO algorithm takes into consideration the capacity of resources and the characteristics of jobs in determining the best resource to process a job. EACO selects resources based on the pheromone value of each resource, which is recorded in matrix form. The initial pheromone value of each resource for each job is calculated based on the estimated transmission time and execution time of a given job. Resources with high pheromone values are selected to process the submitted jobs. A global pheromone update is performed after the jobs have been processed in order to reduce the pheromone value of resources. A simulation environment was developed in Java to test the performance of the proposed EACO algorithm against other ant-based algorithms in terms of resource utilization. Experimental results show that EACO produced a better grid resource management solution.
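
The three mechanisms described in this abstract can be pictured with a small pheromone matrix. The sketch below is not the published EACO algorithm: the abstract does not give its formulas, so the inverse-of-total-time initial pheromone, the greedy selection rule and the reduction factor `rho` are all assumptions chosen only to make the idea concrete.

```python
from typing import List

Matrix = List[List[float]]


def initial_pheromone(transmission: Matrix, execution: Matrix) -> Matrix:
    """Pheromone matrix [job][resource]; assumed to be 1 / estimated total time."""
    return [
        [1.0 / (t + e) for t, e in zip(t_row, e_row)]
        for t_row, e_row in zip(transmission, execution)
    ]


def select_resource(pheromone_row: List[float]) -> int:
    """Pick the resource index with the highest pheromone value for one job."""
    return max(range(len(pheromone_row)), key=lambda r: pheromone_row[r])


def global_update(pheromone: Matrix, resource: int, rho: float = 0.5) -> None:
    """Reduce the pheromone of a used resource for every job (assumed factor rho)."""
    for row in pheromone:
        row[resource] *= 1.0 - rho


# Hypothetical time estimates (seconds) for 2 jobs on 3 resources.
transmission = [[1.0, 2.0, 1.0], [1.0, 1.5, 3.0]]
execution = [[4.0, 2.0, 9.0], [4.0, 3.0, 3.0]]

pheromone = initial_pheromone(transmission, execution)
first = select_resource(pheromone[0])   # resource chosen for job 0
global_update(pheromone, first)         # make that resource less attractive
second = select_resource(pheromone[1])  # resource chosen for job 1
```

In this toy run both jobs initially prefer the same resource; lowering its pheromone after the first assignment steers the second job elsewhere, which is the load-balancing effect the abstract attributes to the global update.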

    Ant colony optimization algorithm for load balancing in grid computing

    Managing resources in a grid computing system is complicated due to the distributed and heterogeneous nature of the resources. This research proposes an enhancement of the ant colony optimization algorithm that caters for dynamic scheduling and load balancing in the grid computing system. The proposed algorithm is known as enhanced ant colony optimization (EACO). The algorithm consists of three new mechanisms that organize the work of an ant colony, i.e. the initial pheromone value mechanism, the resource selection mechanism and the pheromone update mechanism. The resource allocation problem is modelled as a graph that can be used by the ant to deliver its pheromone. This graph consists of four types of vertices, namely job, requirement, resource and capacity, which are used in constructing the grid resource management element. The proposed EACO algorithm takes into consideration the capacity of resources and the characteristics of jobs in determining the best resource to process a job. EACO selects resources based on the pheromone value of each resource, which is recorded in matrix form. The initial pheromone value of each resource for each job is calculated based on the estimated transmission time and execution time of a given job. Resources with high pheromone values are selected to process the submitted jobs. A global pheromone update is performed after the jobs have been processed in order to reduce the pheromone value of resources. A simulation environment was developed in Java to test the performance of the proposed EACO algorithm against other ant-based algorithms in terms of resource utilization. Experimental results show that EACO produced a better grid resource management solution.

    Investigating teacher's integrity through association rule mining

    The selection of teachers to attend training is currently done randomly or by rotation, and is not based on their work performance. This poses a problem in selecting the right teacher to attend the right course. Up until now, there has been no intelligent model to assist school management in determining the integrity level of teachers and assigning them to the right training program. Thus, this study investigates the integrity traits of teachers using the association rule technique, with the aim of assisting school management in organizing training related to teachers’ integrity performance and avoiding sending the wrong teacher for training. The Trainees Integrity Dataset, representing 1500 secondary school teachers on Langkawi Island, Malaysia in 2009, was pre-processed and mined using Apriori. The mined knowledge was analyzed based on the demographic and integrity traits of teachers. The findings indicate that adaptability and stability are the weakest integrity traits among teachers. Teachers in the 26-30 age group were found to have lower integrity performance. However, other demographic factors such as gender, race and grade position were not able to reflect low integrity levels in this study. Finally, this study produces a component of a trainee selection module which can be used as a guideline for school management to propose suitable training programs for teachers to improve their integrity, mainly on the adaptability and stability traits.
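
For readers unfamiliar with Apriori, the sketch below shows the level-wise search for frequent itemsets and the confidence of one rule. The four records are invented for illustration and do not come from the Trainees Integrity Dataset; the attribute labels and the 0.5 support threshold are likewise assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

Itemset = FrozenSet[str]


def frequent_itemsets(transactions: List[Set[str]], min_support: float) -> Dict[Itemset, float]:
    """Level-wise Apriori search: return each frequent itemset with its support."""
    n = len(transactions)

    def support(itemset: Itemset) -> float:
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    frequent: Dict[Itemset, float] = {}
    size = 1
    while current:
        scored = {c: support(c) for c in current}
        kept = {c: s for c, s in scored.items() if s >= min_support}
        frequent.update(kept)
        size += 1
        # Join step: merge frequent sets into candidates one item larger.
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in kept for s in combinations(c, size - 1))
        }
    return frequent


def confidence(frequent: Dict[Itemset, float], antecedent: Itemset, consequent: Itemset) -> float:
    """confidence(A -> B) = support(A and B together) / support(A)."""
    return frequent[antecedent | consequent] / frequent[antecedent]


# Invented toy records; NOT data from the study.
records = [
    {"age=26-30", "adaptability=low"},
    {"age=26-30", "adaptability=low", "stability=low"},
    {"age=31-35", "adaptability=high"},
    {"age=26-30", "adaptability=high"},
]
freq = frequent_itemsets(records, min_support=0.5)
rule_conf = confidence(freq, frozenset({"age=26-30"}), frozenset({"adaptability=low"}))
```

A rule is usually reported only when both its support and its confidence exceed chosen thresholds, which is how demographic attributes and trait levels would be linked in an analysis of this kind.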

    Fuzzy distance-based undersampling technique for imbalanced flood data

    The performance of classifiers is affected by imbalanced data because instances in the minority class are often ignored. Imbalanced data occur in many application domains, including the flood domain. The impact of misclassifying flood cases is higher than that of misclassifying non-flood cases. Numerous resampling techniques such as undersampling and oversampling have been used to overcome the problem of misclassification of imbalanced data. However, undersampling and oversampling techniques suffer from the elimination of relevant data and from overfitting, which may lead to poor classification results. This paper proposes a Fuzzy Distance-based Undersampling (FDUS) technique to increase classification accuracy. Entropy estimation is used to generate fuzzy thresholds, which are used to categorise the instances in the majority and minority classes into membership functions. The performance of FDUS was compared with that of three other techniques based on F-measure and G-mean in experiments on flood data. FDUS achieved better F-measure and G-mean than the other techniques, which showed that FDUS was able to reduce the elimination of relevant data.
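
The two evaluation measures named here follow directly from a confusion matrix, as the sketch below shows. The counts are hypothetical and only illustrate why these measures are preferred to plain accuracy on imbalanced data; they are not results from the paper.

```python
import math


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive (minority) class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical confusion matrix: 10 flood cases, 90 non-flood cases.
tp, fn, tn, fp = 8, 2, 86, 4
f1 = f_measure(tp, fp, fn)
gm = g_mean(tp, fn, tn, fp)

# A classifier that labels everything "non-flood" reaches 90% accuracy here,
# yet both measures drop to zero because no flood case is detected.
f1_trivial = f_measure(0, 0, 10)
gm_trivial = g_mean(0, 10, 90, 0)
```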

    Topic identification method for textual document

    Topic identification is a crucial task for discovering knowledge from textual documents. Existing methods for topic identification suffer from the word counting problem, as they depend on the most frequent terms in the text to produce the topic keyword. Not all frequent terms are relevant. This paper proposes a topic identification method that filters the important terms from the preprocessed text and applies a term weighting scheme to solve the synonym problem. A rule generation algorithm is used to determine the appropriate topics based on the weighted terms. The text document used in the experiment is the English translation of the Quran. The topics identified by the proposed method were compared with topics identified using Rough Set and by domain experts. The proposed method was consistently able to identify topics that are close to those given by Rough Set and the experts. The results of the comparison showed that the proposed method can be used to capture topics for textual documents.